{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "l0Y7_lgN4jzM"
      },
      "source": [
        "# **Bioinformatics Project - Computational Drug Discovery [Part 3] Descriptor Calculation and Dataset Preparation(Modified Version towards unknown smiles predcition purpose)**\n",
        "\n",
        "Chanin Nantasenamat\n",
        "\n",
        "[*'Data Professor' YouTube channel*](http://youtube.com/dataprofessor)\n",
        "\n",
        "Modified by quantaosun@gmail.com\n",
        "\n",
        "\n",
        "In **Part 3**, we will be calculating molecular descriptors that are essentially quantitative description of the compounds in the dataset. Finally, we will be preparing this into a dataset for subsequent model prediction in notebook 5.\n",
        "\n",
        "---"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o-4IOizard4P"
      },
      "source": [
        "## **Download PaDEL-Descriptor**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "H0mjQ2PcrSe5",
        "outputId": "2cb2b1f7-5770-40a3-fd2e-0f3c31ba8db1"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "--2022-01-09 12:11:01--  https://github.com/dataprofessor/bioinformatics/raw/master/padel.zip\n",
            "Resolving github.com (github.com)... 140.82.121.3\n",
            "Connecting to github.com (github.com)|140.82.121.3|:443... connected.\n",
            "HTTP request sent, awaiting response... 302 Found\n",
            "Location: https://raw.githubusercontent.com/dataprofessor/bioinformatics/master/padel.zip [following]\n",
            "--2022-01-09 12:11:01--  https://raw.githubusercontent.com/dataprofessor/bioinformatics/master/padel.zip\n",
            "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
            "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 25768637 (25M) [application/zip]\n",
            "Saving to: ‘padel.zip’\n",
            "\n",
            "padel.zip           100%[===================>]  24.57M   163MB/s    in 0.2s    \n",
            "\n",
            "2022-01-09 12:11:02 (163 MB/s) - ‘padel.zip’ saved [25768637/25768637]\n",
            "\n",
            "--2022-01-09 12:11:02--  https://github.com/dataprofessor/bioinformatics/raw/master/padel.sh\n",
            "Resolving github.com (github.com)... 140.82.113.3\n",
            "Connecting to github.com (github.com)|140.82.113.3|:443... connected.\n",
            "HTTP request sent, awaiting response... 302 Found\n",
            "Location: https://raw.githubusercontent.com/dataprofessor/bioinformatics/master/padel.sh [following]\n",
            "--2022-01-09 12:11:02--  https://raw.githubusercontent.com/dataprofessor/bioinformatics/master/padel.sh\n",
            "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
            "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 231 [text/plain]\n",
            "Saving to: ‘padel.sh’\n",
            "\n",
            "padel.sh            100%[===================>]     231  --.-KB/s    in 0s      \n",
            "\n",
            "2022-01-09 12:11:02 (9.03 MB/s) - ‘padel.sh’ saved [231/231]\n",
            "\n"
          ]
        }
      ],
      "source": [
        "! wget https://github.com/dataprofessor/bioinformatics/raw/master/padel.zip\n",
        "! wget https://github.com/dataprofessor/bioinformatics/raw/master/padel.sh"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "64HnTL4tS-nA",
        "outputId": "c54ab938-5f82-4389-fc13-7f9e7324af34"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Archive:  padel.zip\n",
            "   creating: PaDEL-Descriptor/\n",
            "  inflating: __MACOSX/._PaDEL-Descriptor  \n",
            "  inflating: PaDEL-Descriptor/MACCSFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._MACCSFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/AtomPairs2DFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._AtomPairs2DFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/EStateFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._EStateFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/Fingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._Fingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/.DS_Store  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._.DS_Store  \n",
            "   creating: PaDEL-Descriptor/license/\n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._license  \n",
            "  inflating: PaDEL-Descriptor/KlekotaRothFingerprintCount.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._KlekotaRothFingerprintCount.xml  \n",
            "  inflating: PaDEL-Descriptor/config  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._config  \n",
            "  inflating: PaDEL-Descriptor/PubchemFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._PubchemFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/ExtendedFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._ExtendedFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/KlekotaRothFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._KlekotaRothFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/GraphOnlyFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._GraphOnlyFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/SubstructureFingerprinter.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._SubstructureFingerprinter.xml  \n",
            "  inflating: PaDEL-Descriptor/Descriptors.xls  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._Descriptors.xls  \n",
            "   creating: PaDEL-Descriptor/lib/\n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._lib  \n",
            "  inflating: PaDEL-Descriptor/PaDEL-Descriptor.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._PaDEL-Descriptor.jar  \n",
            "  inflating: PaDEL-Descriptor/SubstructureFingerprintCount.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._SubstructureFingerprintCount.xml  \n",
            "  inflating: PaDEL-Descriptor/AtomPairs2DFingerprintCount.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._AtomPairs2DFingerprintCount.xml  \n",
            "  inflating: PaDEL-Descriptor/descriptors.xml  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/._descriptors.xml  \n",
            "  inflating: PaDEL-Descriptor/license/lgpl-2.1.txt  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/license/._lgpl-2.1.txt  \n",
            "  inflating: PaDEL-Descriptor/license/LICENSE.txt  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/license/._LICENSE.txt  \n",
            "  inflating: PaDEL-Descriptor/license/README - CDK  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/license/._README - CDK  \n",
            "  inflating: PaDEL-Descriptor/license/lgpl.license  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/license/._lgpl.license  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-core-2.4.7-SNAPSHOT(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-core-2.4.7-SNAPSHOT(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(6).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(6).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0(4).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0(4).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/xom-1.1(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._xom-1.1(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/swing-worker-1.1.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._swing-worker-1.1.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0(5).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0(5).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/appframework-1.0.3.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._appframework-1.0.3.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(7).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(7).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/vecmath1.2-1.14.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._vecmath1.2-1.14.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT(6).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT(6).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-core-2.4.7-SNAPSHOT(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-core-2.4.7-SNAPSHOT(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(6).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(6).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Descriptor(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Descriptor(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(4).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(4).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-base-2.4.7-SNAPSHOT.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-base-2.4.7-SNAPSHOT.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(8).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(8).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-core-2.4.7-SNAPSHOT(4).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-core-2.4.7-SNAPSHOT(4).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/xom-1.1.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._xom-1.1.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(5).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(5).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Descriptor(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Descriptor(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(7).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(7).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-core-2.4.7-SNAPSHOT.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-core-2.4.7-SNAPSHOT.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(6).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(6).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Descriptor(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Descriptor(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(4).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(4).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/l2fprod-common-all(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._l2fprod-common-all(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/l2fprod-common-all.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._l2fprod-common-all.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(5).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(5).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(7).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(7).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Descriptor.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Descriptor.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(4).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(4).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/cdk-1.4.15.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._cdk-1.4.15.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT(5).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT(5).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-core-2.4.7-SNAPSHOT(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-core-2.4.7-SNAPSHOT(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(8).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(8).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jgrapht-0.6.0(6).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jgrapht-0.6.0(6).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(2).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(2).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/jama(3).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._jama(3).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/commons-cli-1.2(1).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._commons-cli-1.2(1).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/guava-17.0.jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._guava-17.0.jar  \n",
            "  inflating: PaDEL-Descriptor/lib/ambit2-smarts-2.4.7-SNAPSHOT(4).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._ambit2-smarts-2.4.7-SNAPSHOT(4).jar  \n",
            "  inflating: PaDEL-Descriptor/lib/libPaDEL-Jobs(5).jar  \n",
            "  inflating: __MACOSX/PaDEL-Descriptor/lib/._libPaDEL-Jobs(5).jar  \n"
          ]
        }
      ],
      "source": [
        "! unzip padel.zip"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QmxXXFa4wTNG"
      },
      "source": [
        "## **Load unknown smiles data, called \"example.txt\"**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fcBvxkPWKFRV"
      },
      "source": [
        "Download the curated ChEMBL bioactivity data that has been pre-processed from Parts 1 and 2 of this Bioinformatics Project series. Here we will be using the **bioactivity_data_3class_pIC50.csv** file that essentially contain the pIC50 values that we will be using for building a regression model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "Fpu5C7HlwV9s"
      },
      "outputs": [],
      "source": [
        "import pandas as pd"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BB5G4I1Q2VN5"
      },
      "source": [
        "# We will convert a raw txt file with two lines of smiles to a standard dataframe by a csv intermediate"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "id": "GCcE8J5XwjtB"
      },
      "outputs": [],
      "source": [
        "df3 = pd.read_csv('/content/sars-cov-2_bioactivity_data_processed_2classes_pIC50.csv')"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 540
        },
        "id": "60z_N6egNiSJ",
        "outputId": "72050f9a-4995-49e6-8aa9-d454ea63490a"
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "\n",
              "  <div id=\"df-8b040e8b-e66e-4da3-8768-bb6a6032911f\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>Unnamed: 0</th>\n",
              "      <th>molecule_chembl_id</th>\n",
              "      <th>canonical_smiles</th>\n",
              "      <th>bioactivity_class</th>\n",
              "      <th>pIC50</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>10</td>\n",
              "      <td>CHEMBL1751</td>\n",
              "      <td>C[C@H]1O[C@@H](O[C@H]2[C@@H](O)C[C@H](O[C@H]3[...</td>\n",
              "      <td>active</td>\n",
              "      <td>6.721246</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11</td>\n",
              "      <td>CHEMBL1526240</td>\n",
              "      <td>CC(C)=CCCC1(C)C=Cc2c(O)c3c(c(CC=C(C)C)c2O1)OC1...</td>\n",
              "      <td>inactive</td>\n",
              "      <td>4.873869</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>17</td>\n",
              "      <td>CHEMBL496</td>\n",
              "      <td>Oc1c(Cl)cc(Cl)c(Cl)c1Cc1c(O)c(Cl)cc(Cl)c1Cl</td>\n",
              "      <td>active</td>\n",
              "      <td>6.045757</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>28</td>\n",
              "      <td>CHEMBL1448</td>\n",
              "      <td>O=C(Nc1ccc([N+](=O)[O-])cc1Cl)c1cc(Cl)ccc1O</td>\n",
              "      <td>active</td>\n",
              "      <td>6.552842</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>34</td>\n",
              "      <td>CHEMBL1242</td>\n",
              "      <td>Nc1ccc(/N=N/c2ccccc2)c(N)n1</td>\n",
              "      <td>inactive</td>\n",
              "      <td>4.552842</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>134</th>\n",
              "      <td>278</td>\n",
              "      <td>CHEMBL3913746</td>\n",
              "      <td>CC(C)(O)CNc1ncc(-c2ccnc(Nc3ccnc(Cl)c3)n2)c(CC2...</td>\n",
              "      <td>active</td>\n",
              "      <td>6.397940</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>135</th>\n",
              "      <td>279</td>\n",
              "      <td>CHEMBL1254577</td>\n",
              "      <td>O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(F)c1)c1ccc2c...</td>\n",
              "      <td>inactive</td>\n",
              "      <td>4.934047</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>136</th>\n",
              "      <td>282</td>\n",
              "      <td>CHEMBL3967196</td>\n",
              "      <td>CCCCn1c(NC(=O)c2cccs2)c(C#N)c2nc3ccccc3nc21</td>\n",
              "      <td>inactive</td>\n",
              "      <td>4.706637</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>137</th>\n",
              "      <td>283</td>\n",
              "      <td>CHEMBL601661</td>\n",
              "      <td>CNC(=O)Nc1ccc(-c2nc(N3CC4CCC(C3)O4)c3cnn(C4CCC...</td>\n",
              "      <td>inactive</td>\n",
              "      <td>4.665144</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>138</th>\n",
              "      <td>284</td>\n",
              "      <td>CHEMBL178334</td>\n",
              "      <td>CC(C)N1CCN(c2ccc(C(=O)c3c(-c4ccc(O)cc4)sc4cc(O...</td>\n",
              "      <td>inactive</td>\n",
              "      <td>4.524765</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>139 rows × 5 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-8b040e8b-e66e-4da3-8768-bb6a6032911f')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-8b040e8b-e66e-4da3-8768-bb6a6032911f button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-8b040e8b-e66e-4da3-8768-bb6a6032911f');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "text/plain": [
              "     Unnamed: 0 molecule_chembl_id  ... bioactivity_class     pIC50\n",
              "0            10         CHEMBL1751  ...            active  6.721246\n",
              "1            11      CHEMBL1526240  ...          inactive  4.873869\n",
              "2            17          CHEMBL496  ...            active  6.045757\n",
              "3            28         CHEMBL1448  ...            active  6.552842\n",
              "4            34         CHEMBL1242  ...          inactive  4.552842\n",
              "..          ...                ...  ...               ...       ...\n",
              "134         278      CHEMBL3913746  ...            active  6.397940\n",
              "135         279      CHEMBL1254577  ...          inactive  4.934047\n",
              "136         282      CHEMBL3967196  ...          inactive  4.706637\n",
              "137         283       CHEMBL601661  ...          inactive  4.665144\n",
              "138         284       CHEMBL178334  ...          inactive  4.524765\n",
              "\n",
              "[139 rows x 5 columns]"
            ]
          },
          "metadata": {},
          "execution_count": 5
        }
      ],
      "source": [
        "df3"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vfl4o7FL28hj"
      },
      "source": [
        "Next the txt file will be save as a csv file, then import back again, to beautify the structure."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "AhGXxI3s1ocY",
        "outputId": "21f99289-35eb-472a-fc24-418baa125277"
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "\n",
              "  <div id=\"df-a4584520-b2ff-4067-ab0c-aa0bc44e7f12\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>canonical_smiles</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>C[C@H]1O[C@@H](O[C@H]2[C@@H](O)C[C@H](O[C@H]3[...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>CC(C)=CCCC1(C)C=Cc2c(O)c3c(c(CC=C(C)C)c2O1)OC1...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Oc1c(Cl)cc(Cl)c(Cl)c1Cc1c(O)c(Cl)cc(Cl)c1Cl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>O=C(Nc1ccc([N+](=O)[O-])cc1Cl)c1cc(Cl)ccc1O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>Nc1ccc(/N=N/c2ccccc2)c(N)n1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>134</th>\n",
              "      <td>CC(C)(O)CNc1ncc(-c2ccnc(Nc3ccnc(Cl)c3)n2)c(CC2...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>135</th>\n",
              "      <td>O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(F)c1)c1ccc2c...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>136</th>\n",
              "      <td>CCCCn1c(NC(=O)c2cccs2)c(C#N)c2nc3ccccc3nc21</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>137</th>\n",
              "      <td>CNC(=O)Nc1ccc(-c2nc(N3CC4CCC(C3)O4)c3cnn(C4CCC...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>138</th>\n",
              "      <td>CC(C)N1CCN(c2ccc(C(=O)c3c(-c4ccc(O)cc4)sc4cc(O...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>139 rows × 1 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-a4584520-b2ff-4067-ab0c-aa0bc44e7f12')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-a4584520-b2ff-4067-ab0c-aa0bc44e7f12 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-a4584520-b2ff-4067-ab0c-aa0bc44e7f12');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "text/plain": [
              "                                      canonical_smiles\n",
              "0    C[C@H]1O[C@@H](O[C@H]2[C@@H](O)C[C@H](O[C@H]3[...\n",
              "1    CC(C)=CCCC1(C)C=Cc2c(O)c3c(c(CC=C(C)C)c2O1)OC1...\n",
              "2          Oc1c(Cl)cc(Cl)c(Cl)c1Cc1c(O)c(Cl)cc(Cl)c1Cl\n",
              "3          O=C(Nc1ccc([N+](=O)[O-])cc1Cl)c1cc(Cl)ccc1O\n",
              "4                          Nc1ccc(/N=N/c2ccccc2)c(N)n1\n",
              "..                                                 ...\n",
              "134  CC(C)(O)CNc1ncc(-c2ccnc(Nc3ccnc(Cl)c3)n2)c(CC2...\n",
              "135  O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(F)c1)c1ccc2c...\n",
              "136        CCCCn1c(NC(=O)c2cccs2)c(C#N)c2nc3ccccc3nc21\n",
              "137  CNC(=O)Nc1ccc(-c2nc(N3CC4CCC(C3)O4)c3cnn(C4CCC...\n",
              "138  CC(C)N1CCN(c2ccc(C(=O)c3c(-c4ccc(O)cc4)sc4cc(O...\n",
              "\n",
              "[139 rows x 1 columns]"
            ]
          },
          "metadata": {},
          "execution_count": 7
        }
      ],
      "source": [
        "selection=['canonical_smiles']\n",
        "df3[selection]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        },
        "id": "kBPFVwji11OJ",
        "outputId": "c609940c-abad-4d59-85a7-c687e8a89583"
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "\n",
              "  <div id=\"df-07acfa6d-ec41-4c60-8bd7-0b767829accf\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>canonical_smiles</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>C[C@H]1O[C@@H](O[C@H]2[C@@H](O)C[C@H](O[C@H]3[...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>CC(C)=CCCC1(C)C=Cc2c(O)c3c(c(CC=C(C)C)c2O1)OC1...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Oc1c(Cl)cc(Cl)c(Cl)c1Cc1c(O)c(Cl)cc(Cl)c1Cl</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>O=C(Nc1ccc([N+](=O)[O-])cc1Cl)c1cc(Cl)ccc1O</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>Nc1ccc(/N=N/c2ccccc2)c(N)n1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>134</th>\n",
              "      <td>CC(C)(O)CNc1ncc(-c2ccnc(Nc3ccnc(Cl)c3)n2)c(CC2...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>135</th>\n",
              "      <td>O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(F)c1)c1ccc2c...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>136</th>\n",
              "      <td>CCCCn1c(NC(=O)c2cccs2)c(C#N)c2nc3ccccc3nc21</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>137</th>\n",
              "      <td>CNC(=O)Nc1ccc(-c2nc(N3CC4CCC(C3)O4)c3cnn(C4CCC...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>138</th>\n",
              "      <td>CC(C)N1CCN(c2ccc(C(=O)c3c(-c4ccc(O)cc4)sc4cc(O...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>139 rows × 1 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-07acfa6d-ec41-4c60-8bd7-0b767829accf')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-07acfa6d-ec41-4c60-8bd7-0b767829accf button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-07acfa6d-ec41-4c60-8bd7-0b767829accf');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "text/plain": [
              "                                      canonical_smiles\n",
              "0    C[C@H]1O[C@@H](O[C@H]2[C@@H](O)C[C@H](O[C@H]3[...\n",
              "1    CC(C)=CCCC1(C)C=Cc2c(O)c3c(c(CC=C(C)C)c2O1)OC1...\n",
              "2          Oc1c(Cl)cc(Cl)c(Cl)c1Cc1c(O)c(Cl)cc(Cl)c1Cl\n",
              "3          O=C(Nc1ccc([N+](=O)[O-])cc1Cl)c1cc(Cl)ccc1O\n",
              "4                          Nc1ccc(/N=N/c2ccccc2)c(N)n1\n",
              "..                                                 ...\n",
              "134  CC(C)(O)CNc1ncc(-c2ccnc(Nc3ccnc(Cl)c3)n2)c(CC2...\n",
              "135  O=C(NCCN1CCC2(CC1)C(=O)NCN2c1cccc(F)c1)c1ccc2c...\n",
              "136        CCCCn1c(NC(=O)c2cccs2)c(C#N)c2nc3ccccc3nc21\n",
              "137  CNC(=O)Nc1ccc(-c2nc(N3CC4CCC(C3)O4)c3cnn(C4CCC...\n",
              "138  CC(C)N1CCN(c2ccc(C(=O)c3c(-c4ccc(O)cc4)sc4cc(O...\n",
              "\n",
              "[139 rows x 1 columns]"
            ]
          },
          "metadata": {},
          "execution_count": 8
        }
      ],
      "source": [
        "df4_selection = df3[selection]\n",
        "df4_selection"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {
        "id": "cly4Gagw2A2S"
      },
      "outputs": [],
      "source": [
        "df4_selection.to_csv('molecule.smi', sep= '\\t', index=False, header=False)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "nRSCoPVDSkf5",
        "outputId": "56b8217c-0b36-448e-b80d-a93bcb99b73a"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "C[C@H]1O[C@@H](O[C@H]2[C@@H](O)C[C@H](O[C@H]3[C@@H](O)C[C@H](O[C@H]4CC[C@]5(C)[C@H]6C[C@@H](O)[C@]7(C)[C@@H](C8=CC(=O)OC8)CC[C@]7(O)[C@@H]6CC[C@@H]5C4)O[C@@H]3C)O[C@@H]2C)C[C@H](O)[C@@H]1O\n",
            "CC(C)=CCCC1(C)C=Cc2c(O)c3c(c(CC=C(C)C)c2O1)OC12C(=CC4CC1C(C)(C)OC2(C/C=C(/C)C(=O)O)C4O)C3=O\n",
            "Oc1c(Cl)cc(Cl)c(Cl)c1Cc1c(O)c(Cl)cc(Cl)c1Cl\n",
            "O=C(Nc1ccc([N+](=O)[O-])cc1Cl)c1cc(Cl)ccc1O\n",
            "Nc1ccc(/N=N/c2ccccc2)c(N)n1\n"
          ]
        }
      ],
      "source": [
        "! cat molecule.smi | head -5"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "GlYaJ9pzUGjS",
        "outputId": "875c7af2-3bf5-40cb-c44c-4c6d85763a20"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "139\n"
          ]
        }
      ],
      "source": [
        "! cat molecule.smi | wc -l"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YzN_S4Quro5S"
      },
      "source": [
        "## **Calculate fingerprint descriptors, for the unknown smiles**\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JsgTV-ByxdMa"
      },
      "source": [
        "### **Calculate PaDEL descriptors**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "hSCopQvEiSMj",
        "outputId": "5cad602b-c991-4842-9aee-acd4e32648c4"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "java -Xms1G -Xmx1G -Djava.awt.headless=true -jar ./PaDEL-Descriptor/PaDEL-Descriptor.jar -removesalt -standardizenitro -fingerprints -descriptortypes ./PaDEL-Descriptor/PubchemFingerprinter.xml -dir ./ -file descriptors_output.csv\n"
          ]
        }
      ],
      "source": [
        "! cat padel.sh"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "6kN9jrGpS5nE",
        "outputId": "29ffaa78-5085-40cc-e812-224bb1b5e4cb"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Processing AUTOGEN_molecule_1 in molecule.smi (1/139). \n",
            "Processing AUTOGEN_molecule_2 in molecule.smi (2/139). \n",
            "Processing AUTOGEN_molecule_3 in molecule.smi (3/139). Average speed: 5.57 s/mol.\n",
            "Processing AUTOGEN_molecule_4 in molecule.smi (4/139). Average speed: 3.22 s/mol.\n",
            "Processing AUTOGEN_molecule_5 in molecule.smi (5/139). Average speed: 2.32 s/mol.\n",
            "Processing AUTOGEN_molecule_6 in molecule.smi (6/139). Average speed: 1.78 s/mol.\n",
            "Processing AUTOGEN_molecule_7 in molecule.smi (7/139). Average speed: 1.47 s/mol.\n",
            "Processing AUTOGEN_molecule_8 in molecule.smi (8/139). Average speed: 1.35 s/mol.\n",
            "Processing AUTOGEN_molecule_9 in molecule.smi (9/139). Average speed: 1.26 s/mol.\n",
            "Processing AUTOGEN_molecule_10 in molecule.smi (10/139). Average speed: 1.12 s/mol.\n",
            "Processing AUTOGEN_molecule_11 in molecule.smi (11/139). Average speed: 1.01 s/mol.\n",
            "Processing AUTOGEN_molecule_12 in molecule.smi (12/139). Average speed: 0.97 s/mol.\n",
            "Processing AUTOGEN_molecule_13 in molecule.smi (13/139). Average speed: 0.90 s/mol.\n",
            "Processing AUTOGEN_molecule_14 in molecule.smi (14/139). Average speed: 0.86 s/mol.\n",
            "Processing AUTOGEN_molecule_15 in molecule.smi (15/139). Average speed: 0.81 s/mol.\n",
            "Processing AUTOGEN_molecule_16 in molecule.smi (16/139). Average speed: 0.77 s/mol.\n",
            "Processing AUTOGEN_molecule_17 in molecule.smi (17/139). Average speed: 0.72 s/mol.\n",
            "Processing AUTOGEN_molecule_19 in molecule.smi (19/139). Average speed: 0.68 s/mol.\n",
            "Processing AUTOGEN_molecule_18 in molecule.smi (18/139). Average speed: 0.71 s/mol.\n",
            "Processing AUTOGEN_molecule_21 in molecule.smi (21/139). Average speed: 0.64 s/mol.\n",
            "Processing AUTOGEN_molecule_20 in molecule.smi (20/139). Average speed: 0.67 s/mol.\n",
            "Processing AUTOGEN_molecule_22 in molecule.smi (22/139). Average speed: 0.63 s/mol.\n",
            "Processing AUTOGEN_molecule_23 in molecule.smi (23/139). Average speed: 0.62 s/mol.\n",
            "Processing AUTOGEN_molecule_25 in molecule.smi (25/139). Average speed: 0.58 s/mol.\n",
            "Processing AUTOGEN_molecule_24 in molecule.smi (24/139). Average speed: 0.58 s/mol.\n",
            "Processing AUTOGEN_molecule_26 in molecule.smi (26/139). Average speed: 0.60 s/mol.\n",
            "Processing AUTOGEN_molecule_27 in molecule.smi (27/139). Average speed: 0.57 s/mol.\n",
            "Processing AUTOGEN_molecule_28 in molecule.smi (28/139). Average speed: 0.55 s/mol.\n",
            "Processing AUTOGEN_molecule_29 in molecule.smi (29/139). Average speed: 0.54 s/mol.\n",
            "Processing AUTOGEN_molecule_30 in molecule.smi (30/139). Average speed: 0.53 s/mol.\n",
            "Processing AUTOGEN_molecule_31 in molecule.smi (31/139). Average speed: 0.53 s/mol.\n",
            "Processing AUTOGEN_molecule_32 in molecule.smi (32/139). Average speed: 0.51 s/mol.\n",
            "Processing AUTOGEN_molecule_33 in molecule.smi (33/139). Average speed: 0.51 s/mol.\n",
            "Processing AUTOGEN_molecule_34 in molecule.smi (34/139). Average speed: 0.50 s/mol.\n",
            "Processing AUTOGEN_molecule_35 in molecule.smi (35/139). Average speed: 0.50 s/mol.\n",
            "Processing AUTOGEN_molecule_36 in molecule.smi (36/139). Average speed: 0.49 s/mol.\n",
            "Processing AUTOGEN_molecule_37 in molecule.smi (37/139). Average speed: 0.48 s/mol.\n",
            "Processing AUTOGEN_molecule_38 in molecule.smi (38/139). Average speed: 0.48 s/mol.\n",
            "Processing AUTOGEN_molecule_39 in molecule.smi (39/139). Average speed: 0.47 s/mol.\n",
            "Processing AUTOGEN_molecule_41 in molecule.smi (41/139). Average speed: 0.46 s/mol.\n",
            "Processing AUTOGEN_molecule_40 in molecule.smi (40/139). Average speed: 0.47 s/mol.\n",
            "Processing AUTOGEN_molecule_42 in molecule.smi (42/139). Average speed: 0.46 s/mol.\n",
            "Processing AUTOGEN_molecule_43 in molecule.smi (43/139). Average speed: 0.45 s/mol.\n",
            "Processing AUTOGEN_molecule_45 in molecule.smi (45/139). Average speed: 0.44 s/mol.\n",
            "Processing AUTOGEN_molecule_44 in molecule.smi (44/139). Average speed: 0.45 s/mol.\n",
            "Processing AUTOGEN_molecule_46 in molecule.smi (46/139). Average speed: 0.44 s/mol.\n",
            "Processing AUTOGEN_molecule_47 in molecule.smi (47/139). Average speed: 0.44 s/mol.\n",
            "Processing AUTOGEN_molecule_48 in molecule.smi (48/139). Average speed: 0.44 s/mol.\n",
            "Processing AUTOGEN_molecule_49 in molecule.smi (49/139). Average speed: 0.43 s/mol.\n",
            "Processing AUTOGEN_molecule_50 in molecule.smi (50/139). Average speed: 0.43 s/mol.\n",
            "Processing AUTOGEN_molecule_51 in molecule.smi (51/139). Average speed: 0.43 s/mol.\n",
            "Processing AUTOGEN_molecule_53 in molecule.smi (53/139). Average speed: 0.43 s/mol.\n",
            "Processing AUTOGEN_molecule_52 in molecule.smi (52/139). Average speed: 0.42 s/mol.\n",
            "Processing AUTOGEN_molecule_54 in molecule.smi (54/139). Average speed: 0.42 s/mol.\n",
            "Processing AUTOGEN_molecule_55 in molecule.smi (55/139). Average speed: 0.42 s/mol.\n",
            "Processing AUTOGEN_molecule_56 in molecule.smi (56/139). Average speed: 0.42 s/mol.\n",
            "Processing AUTOGEN_molecule_57 in molecule.smi (57/139). Average speed: 0.42 s/mol.\n",
            "Processing AUTOGEN_molecule_58 in molecule.smi (58/139). Average speed: 0.41 s/mol.\n",
            "Processing AUTOGEN_molecule_60 in molecule.smi (60/139). Average speed: 0.41 s/mol.\n",
            "Processing AUTOGEN_molecule_59 in molecule.smi (59/139). Average speed: 0.41 s/mol.\n",
            "Processing AUTOGEN_molecule_62 in molecule.smi (62/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_61 in molecule.smi (61/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_63 in molecule.smi (63/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_64 in molecule.smi (64/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_65 in molecule.smi (65/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_66 in molecule.smi (66/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_67 in molecule.smi (67/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_68 in molecule.smi (68/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_70 in molecule.smi (70/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_69 in molecule.smi (69/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_72 in molecule.smi (72/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_71 in molecule.smi (71/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_73 in molecule.smi (73/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_75 in molecule.smi (75/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_74 in molecule.smi (74/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_76 in molecule.smi (76/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_77 in molecule.smi (77/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_78 in molecule.smi (78/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_79 in molecule.smi (79/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_80 in molecule.smi (80/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_81 in molecule.smi (81/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_82 in molecule.smi (82/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_83 in molecule.smi (83/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_84 in molecule.smi (84/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_85 in molecule.smi (85/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_86 in molecule.smi (86/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_87 in molecule.smi (87/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_88 in molecule.smi (88/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_90 in molecule.smi (90/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_89 in molecule.smi (89/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_91 in molecule.smi (91/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_92 in molecule.smi (92/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_93 in molecule.smi (93/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_94 in molecule.smi (94/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_95 in molecule.smi (95/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_96 in molecule.smi (96/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_97 in molecule.smi (97/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_98 in molecule.smi (98/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_99 in molecule.smi (99/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_100 in molecule.smi (100/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_101 in molecule.smi (101/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_102 in molecule.smi (102/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_103 in molecule.smi (103/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_105 in molecule.smi (105/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_104 in molecule.smi (104/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_106 in molecule.smi (106/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_107 in molecule.smi (107/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_109 in molecule.smi (109/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_108 in molecule.smi (108/139). Average speed: 0.40 s/mol.\n",
            "Processing AUTOGEN_molecule_110 in molecule.smi (110/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_111 in molecule.smi (111/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_112 in molecule.smi (112/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_113 in molecule.smi (113/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_114 in molecule.smi (114/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_115 in molecule.smi (115/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_116 in molecule.smi (116/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_117 in molecule.smi (117/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_118 in molecule.smi (118/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_119 in molecule.smi (119/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_120 in molecule.smi (120/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_121 in molecule.smi (121/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_122 in molecule.smi (122/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_123 in molecule.smi (123/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_124 in molecule.smi (124/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_125 in molecule.smi (125/139). Average speed: 0.39 s/mol.\n",
            "Processing AUTOGEN_molecule_126 in molecule.smi (126/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_127 in molecule.smi (127/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_128 in molecule.smi (128/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_130 in molecule.smi (130/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_129 in molecule.smi (129/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_132 in molecule.smi (132/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_131 in molecule.smi (131/139). Average speed: 0.38 s/mol.\n",
            "Processing AUTOGEN_molecule_134 in molecule.smi (134/139). Average speed: 0.37 s/mol.\n",
            "Processing AUTOGEN_molecule_133 in molecule.smi (133/139). Average speed: 0.37 s/mol.\n",
            "Processing AUTOGEN_molecule_135 in molecule.smi (135/139). Average speed: 0.37 s/mol.\n",
            "Processing AUTOGEN_molecule_136 in molecule.smi (136/139). Average speed: 0.37 s/mol.\n",
            "Processing AUTOGEN_molecule_137 in molecule.smi (137/139). Average speed: 0.37 s/mol.\n",
            "Processing AUTOGEN_molecule_138 in molecule.smi (138/139). Average speed: 0.37 s/mol.\n",
            "Processing AUTOGEN_molecule_139 in molecule.smi (139/139). Average speed: 0.37 s/mol.\n",
            "Descriptor calculation completed in 51.77 secs . Average speed: 0.37 s/mol.\n"
          ]
        }
      ],
      "source": [
        "! bash padel.sh"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "2p7rAVy_k_hH",
        "outputId": "35c83bec-c747-4a8e-8c6c-86cc790a4a57"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "total 25468\n",
            "-rw-r--r-- 1 root root   259355 Jan  9 12:14 descriptors_output.csv\n",
            "drwxr-xr-x 3 root root     4096 Jan  9 12:11 __MACOSX\n",
            "-rw-r--r-- 1 root root     8735 Jan  9 12:13 molecule.smi\n",
            "drwxrwxr-x 4 root root     4096 May 30  2020 PaDEL-Descriptor\n",
            "-rw-r--r-- 1 root root      231 Jan  9 12:11 padel.sh\n",
            "-rw-r--r-- 1 root root 25768637 Jan  9 12:11 padel.zip\n",
            "drwxr-xr-x 1 root root     4096 Dec 23 14:32 sample_data\n",
            "-rw-r--r-- 1 root root    14767 Jan  9 12:11 sars-cov-2_bioactivity_data_processed_2classes_pIC50.csv\n"
          ]
        }
      ],
      "source": [
        "! ls -l"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gUMlPfFrxicj"
      },
      "source": [
        "## **Preparing the X matrices**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "30aa4WP4ZA8M"
      },
      "source": [
        "### **X data matrix**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {
        "id": "3g319qxVl7tY"
      },
      "outputs": [],
      "source": [
        "df3_X = pd.read_csv('descriptors_output.csv')"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 488
        },
        "id": "hBp1PTObFQDd",
        "outputId": "8f07dd95-b71d-4669-b8df-89b5db26738a"
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "\n",
              "  <div id=\"df-b9347ca8-0cbf-49f0-aee2-ea9b79355793\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>Name</th>\n",
              "      <th>PubchemFP0</th>\n",
              "      <th>PubchemFP1</th>\n",
              "      <th>PubchemFP2</th>\n",
              "      <th>PubchemFP3</th>\n",
              "      <th>PubchemFP4</th>\n",
              "      <th>PubchemFP5</th>\n",
              "      <th>PubchemFP6</th>\n",
              "      <th>PubchemFP7</th>\n",
              "      <th>PubchemFP8</th>\n",
              "      <th>PubchemFP9</th>\n",
              "      <th>PubchemFP10</th>\n",
              "      <th>PubchemFP11</th>\n",
              "      <th>PubchemFP12</th>\n",
              "      <th>PubchemFP13</th>\n",
              "      <th>PubchemFP14</th>\n",
              "      <th>PubchemFP15</th>\n",
              "      <th>PubchemFP16</th>\n",
              "      <th>PubchemFP17</th>\n",
              "      <th>PubchemFP18</th>\n",
              "      <th>PubchemFP19</th>\n",
              "      <th>PubchemFP20</th>\n",
              "      <th>PubchemFP21</th>\n",
              "      <th>PubchemFP22</th>\n",
              "      <th>PubchemFP23</th>\n",
              "      <th>PubchemFP24</th>\n",
              "      <th>PubchemFP25</th>\n",
              "      <th>PubchemFP26</th>\n",
              "      <th>PubchemFP27</th>\n",
              "      <th>PubchemFP28</th>\n",
              "      <th>PubchemFP29</th>\n",
              "      <th>PubchemFP30</th>\n",
              "      <th>PubchemFP31</th>\n",
              "      <th>PubchemFP32</th>\n",
              "      <th>PubchemFP33</th>\n",
              "      <th>PubchemFP34</th>\n",
              "      <th>PubchemFP35</th>\n",
              "      <th>PubchemFP36</th>\n",
              "      <th>PubchemFP37</th>\n",
              "      <th>PubchemFP38</th>\n",
              "      <th>...</th>\n",
              "      <th>PubchemFP841</th>\n",
              "      <th>PubchemFP842</th>\n",
              "      <th>PubchemFP843</th>\n",
              "      <th>PubchemFP844</th>\n",
              "      <th>PubchemFP845</th>\n",
              "      <th>PubchemFP846</th>\n",
              "      <th>PubchemFP847</th>\n",
              "      <th>PubchemFP848</th>\n",
              "      <th>PubchemFP849</th>\n",
              "      <th>PubchemFP850</th>\n",
              "      <th>PubchemFP851</th>\n",
              "      <th>PubchemFP852</th>\n",
              "      <th>PubchemFP853</th>\n",
              "      <th>PubchemFP854</th>\n",
              "      <th>PubchemFP855</th>\n",
              "      <th>PubchemFP856</th>\n",
              "      <th>PubchemFP857</th>\n",
              "      <th>PubchemFP858</th>\n",
              "      <th>PubchemFP859</th>\n",
              "      <th>PubchemFP860</th>\n",
              "      <th>PubchemFP861</th>\n",
              "      <th>PubchemFP862</th>\n",
              "      <th>PubchemFP863</th>\n",
              "      <th>PubchemFP864</th>\n",
              "      <th>PubchemFP865</th>\n",
              "      <th>PubchemFP866</th>\n",
              "      <th>PubchemFP867</th>\n",
              "      <th>PubchemFP868</th>\n",
              "      <th>PubchemFP869</th>\n",
              "      <th>PubchemFP870</th>\n",
              "      <th>PubchemFP871</th>\n",
              "      <th>PubchemFP872</th>\n",
              "      <th>PubchemFP873</th>\n",
              "      <th>PubchemFP874</th>\n",
              "      <th>PubchemFP875</th>\n",
              "      <th>PubchemFP876</th>\n",
              "      <th>PubchemFP877</th>\n",
              "      <th>PubchemFP878</th>\n",
              "      <th>PubchemFP879</th>\n",
              "      <th>PubchemFP880</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>AUTOGEN_molecule_2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>AUTOGEN_molecule_3</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>AUTOGEN_molecule_4</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>AUTOGEN_molecule_1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>AUTOGEN_molecule_5</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>134</th>\n",
              "      <td>AUTOGEN_molecule_134</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>135</th>\n",
              "      <td>AUTOGEN_molecule_137</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>136</th>\n",
              "      <td>AUTOGEN_molecule_136</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>137</th>\n",
              "      <td>AUTOGEN_molecule_138</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>138</th>\n",
              "      <td>AUTOGEN_molecule_139</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>139 rows × 882 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-b9347ca8-0cbf-49f0-aee2-ea9b79355793')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-b9347ca8-0cbf-49f0-aee2-ea9b79355793 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-b9347ca8-0cbf-49f0-aee2-ea9b79355793');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "text/plain": [
              "                     Name  PubchemFP0  ...  PubchemFP879  PubchemFP880\n",
              "0      AUTOGEN_molecule_2           1  ...             0             0\n",
              "1      AUTOGEN_molecule_3           1  ...             0             0\n",
              "2      AUTOGEN_molecule_4           1  ...             0             0\n",
              "3      AUTOGEN_molecule_1           1  ...             0             0\n",
              "4      AUTOGEN_molecule_5           1  ...             0             0\n",
              "..                    ...         ...  ...           ...           ...\n",
              "134  AUTOGEN_molecule_134           1  ...             0             0\n",
              "135  AUTOGEN_molecule_137           1  ...             0             0\n",
              "136  AUTOGEN_molecule_136           1  ...             0             0\n",
              "137  AUTOGEN_molecule_138           1  ...             0             0\n",
              "138  AUTOGEN_molecule_139           1  ...             0             0\n",
              "\n",
              "[139 rows x 882 columns]"
            ]
          },
          "metadata": {},
          "execution_count": 16
        }
      ],
      "source": [
        "df3_X"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "df3_X = df3_X.drop(columns=['Name'])\n",
        "df3_X"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 488
        },
        "id": "QX2UdupmxpD-",
        "outputId": "1e2cf709-7b50-473f-cb16-1ec77ec5c4c3"
      },
      "execution_count": 17,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "\n",
              "  <div id=\"df-53ef03c1-ac93-4a16-b859-8ab42e748a2b\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>PubchemFP0</th>\n",
              "      <th>PubchemFP1</th>\n",
              "      <th>PubchemFP2</th>\n",
              "      <th>PubchemFP3</th>\n",
              "      <th>PubchemFP4</th>\n",
              "      <th>PubchemFP5</th>\n",
              "      <th>PubchemFP6</th>\n",
              "      <th>PubchemFP7</th>\n",
              "      <th>PubchemFP8</th>\n",
              "      <th>PubchemFP9</th>\n",
              "      <th>PubchemFP10</th>\n",
              "      <th>PubchemFP11</th>\n",
              "      <th>PubchemFP12</th>\n",
              "      <th>PubchemFP13</th>\n",
              "      <th>PubchemFP14</th>\n",
              "      <th>PubchemFP15</th>\n",
              "      <th>PubchemFP16</th>\n",
              "      <th>PubchemFP17</th>\n",
              "      <th>PubchemFP18</th>\n",
              "      <th>PubchemFP19</th>\n",
              "      <th>PubchemFP20</th>\n",
              "      <th>PubchemFP21</th>\n",
              "      <th>PubchemFP22</th>\n",
              "      <th>PubchemFP23</th>\n",
              "      <th>PubchemFP24</th>\n",
              "      <th>PubchemFP25</th>\n",
              "      <th>PubchemFP26</th>\n",
              "      <th>PubchemFP27</th>\n",
              "      <th>PubchemFP28</th>\n",
              "      <th>PubchemFP29</th>\n",
              "      <th>PubchemFP30</th>\n",
              "      <th>PubchemFP31</th>\n",
              "      <th>PubchemFP32</th>\n",
              "      <th>PubchemFP33</th>\n",
              "      <th>PubchemFP34</th>\n",
              "      <th>PubchemFP35</th>\n",
              "      <th>PubchemFP36</th>\n",
              "      <th>PubchemFP37</th>\n",
              "      <th>PubchemFP38</th>\n",
              "      <th>PubchemFP39</th>\n",
              "      <th>...</th>\n",
              "      <th>PubchemFP841</th>\n",
              "      <th>PubchemFP842</th>\n",
              "      <th>PubchemFP843</th>\n",
              "      <th>PubchemFP844</th>\n",
              "      <th>PubchemFP845</th>\n",
              "      <th>PubchemFP846</th>\n",
              "      <th>PubchemFP847</th>\n",
              "      <th>PubchemFP848</th>\n",
              "      <th>PubchemFP849</th>\n",
              "      <th>PubchemFP850</th>\n",
              "      <th>PubchemFP851</th>\n",
              "      <th>PubchemFP852</th>\n",
              "      <th>PubchemFP853</th>\n",
              "      <th>PubchemFP854</th>\n",
              "      <th>PubchemFP855</th>\n",
              "      <th>PubchemFP856</th>\n",
              "      <th>PubchemFP857</th>\n",
              "      <th>PubchemFP858</th>\n",
              "      <th>PubchemFP859</th>\n",
              "      <th>PubchemFP860</th>\n",
              "      <th>PubchemFP861</th>\n",
              "      <th>PubchemFP862</th>\n",
              "      <th>PubchemFP863</th>\n",
              "      <th>PubchemFP864</th>\n",
              "      <th>PubchemFP865</th>\n",
              "      <th>PubchemFP866</th>\n",
              "      <th>PubchemFP867</th>\n",
              "      <th>PubchemFP868</th>\n",
              "      <th>PubchemFP869</th>\n",
              "      <th>PubchemFP870</th>\n",
              "      <th>PubchemFP871</th>\n",
              "      <th>PubchemFP872</th>\n",
              "      <th>PubchemFP873</th>\n",
              "      <th>PubchemFP874</th>\n",
              "      <th>PubchemFP875</th>\n",
              "      <th>PubchemFP876</th>\n",
              "      <th>PubchemFP877</th>\n",
              "      <th>PubchemFP878</th>\n",
              "      <th>PubchemFP879</th>\n",
              "      <th>PubchemFP880</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>134</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>135</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>136</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>137</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>138</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>139 rows × 881 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-53ef03c1-ac93-4a16-b859-8ab42e748a2b')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-53ef03c1-ac93-4a16-b859-8ab42e748a2b button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-53ef03c1-ac93-4a16-b859-8ab42e748a2b');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "text/plain": [
              "     PubchemFP0  PubchemFP1  ...  PubchemFP879  PubchemFP880\n",
              "0             1           1  ...             0             0\n",
              "1             1           0  ...             0             0\n",
              "2             1           1  ...             0             0\n",
              "3             1           1  ...             0             0\n",
              "4             1           1  ...             0             0\n",
              "..          ...         ...  ...           ...           ...\n",
              "134           1           1  ...             0             0\n",
              "135           1           1  ...             0             0\n",
              "136           1           1  ...             0             0\n",
              "137           1           1  ...             0             0\n",
              "138           1           1  ...             0             0\n",
              "\n",
              "[139 rows x 881 columns]"
            ]
          },
          "metadata": {},
          "execution_count": 17
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        " # Y variable"
      ],
      "metadata": {
        "id": "TleL3rRUx3fG"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df3_Y = df3['pIC50']\n",
        "df3_Y"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "o9VYwQadxxn1",
        "outputId": "4751a70a-6de2-46f5-fc77-f5cb8ed58e24"
      },
      "execution_count": 21,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0      6.721246\n",
              "1      4.873869\n",
              "2      6.045757\n",
              "3      6.552842\n",
              "4      4.552842\n",
              "         ...   \n",
              "134    6.397940\n",
              "135    4.934047\n",
              "136    4.706637\n",
              "137    4.665144\n",
              "138    4.524765\n",
              "Name: pIC50, Length: 139, dtype: float64"
            ]
          },
          "metadata": {},
          "execution_count": 21
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Combine X and Y"
      ],
      "metadata": {
        "id": "X2Vdez08yMb8"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "dataset3 = pd.concat([df3_X,df3_Y], axis=1)\n",
        "dataset3"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 488
        },
        "id": "LUrRRezayMph",
        "outputId": "e652549f-e5ce-4530-dc76-f9ce9de15279"
      },
      "execution_count": 22,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "\n",
              "  <div id=\"df-474dc339-df7b-4fd0-84ba-272e22d7dba4\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>PubchemFP0</th>\n",
              "      <th>PubchemFP1</th>\n",
              "      <th>PubchemFP2</th>\n",
              "      <th>PubchemFP3</th>\n",
              "      <th>PubchemFP4</th>\n",
              "      <th>PubchemFP5</th>\n",
              "      <th>PubchemFP6</th>\n",
              "      <th>PubchemFP7</th>\n",
              "      <th>PubchemFP8</th>\n",
              "      <th>PubchemFP9</th>\n",
              "      <th>PubchemFP10</th>\n",
              "      <th>PubchemFP11</th>\n",
              "      <th>PubchemFP12</th>\n",
              "      <th>PubchemFP13</th>\n",
              "      <th>PubchemFP14</th>\n",
              "      <th>PubchemFP15</th>\n",
              "      <th>PubchemFP16</th>\n",
              "      <th>PubchemFP17</th>\n",
              "      <th>PubchemFP18</th>\n",
              "      <th>PubchemFP19</th>\n",
              "      <th>PubchemFP20</th>\n",
              "      <th>PubchemFP21</th>\n",
              "      <th>PubchemFP22</th>\n",
              "      <th>PubchemFP23</th>\n",
              "      <th>PubchemFP24</th>\n",
              "      <th>PubchemFP25</th>\n",
              "      <th>PubchemFP26</th>\n",
              "      <th>PubchemFP27</th>\n",
              "      <th>PubchemFP28</th>\n",
              "      <th>PubchemFP29</th>\n",
              "      <th>PubchemFP30</th>\n",
              "      <th>PubchemFP31</th>\n",
              "      <th>PubchemFP32</th>\n",
              "      <th>PubchemFP33</th>\n",
              "      <th>PubchemFP34</th>\n",
              "      <th>PubchemFP35</th>\n",
              "      <th>PubchemFP36</th>\n",
              "      <th>PubchemFP37</th>\n",
              "      <th>PubchemFP38</th>\n",
              "      <th>PubchemFP39</th>\n",
              "      <th>...</th>\n",
              "      <th>PubchemFP842</th>\n",
              "      <th>PubchemFP843</th>\n",
              "      <th>PubchemFP844</th>\n",
              "      <th>PubchemFP845</th>\n",
              "      <th>PubchemFP846</th>\n",
              "      <th>PubchemFP847</th>\n",
              "      <th>PubchemFP848</th>\n",
              "      <th>PubchemFP849</th>\n",
              "      <th>PubchemFP850</th>\n",
              "      <th>PubchemFP851</th>\n",
              "      <th>PubchemFP852</th>\n",
              "      <th>PubchemFP853</th>\n",
              "      <th>PubchemFP854</th>\n",
              "      <th>PubchemFP855</th>\n",
              "      <th>PubchemFP856</th>\n",
              "      <th>PubchemFP857</th>\n",
              "      <th>PubchemFP858</th>\n",
              "      <th>PubchemFP859</th>\n",
              "      <th>PubchemFP860</th>\n",
              "      <th>PubchemFP861</th>\n",
              "      <th>PubchemFP862</th>\n",
              "      <th>PubchemFP863</th>\n",
              "      <th>PubchemFP864</th>\n",
              "      <th>PubchemFP865</th>\n",
              "      <th>PubchemFP866</th>\n",
              "      <th>PubchemFP867</th>\n",
              "      <th>PubchemFP868</th>\n",
              "      <th>PubchemFP869</th>\n",
              "      <th>PubchemFP870</th>\n",
              "      <th>PubchemFP871</th>\n",
              "      <th>PubchemFP872</th>\n",
              "      <th>PubchemFP873</th>\n",
              "      <th>PubchemFP874</th>\n",
              "      <th>PubchemFP875</th>\n",
              "      <th>PubchemFP876</th>\n",
              "      <th>PubchemFP877</th>\n",
              "      <th>PubchemFP878</th>\n",
              "      <th>PubchemFP879</th>\n",
              "      <th>PubchemFP880</th>\n",
              "      <th>pIC50</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>6.721246</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4.873869</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>6.045757</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>6.552842</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4.552842</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>134</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>6.397940</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>135</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4.934047</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>136</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4.706637</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>137</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4.665144</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>138</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4.524765</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>139 rows × 882 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-474dc339-df7b-4fd0-84ba-272e22d7dba4')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-474dc339-df7b-4fd0-84ba-272e22d7dba4 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-474dc339-df7b-4fd0-84ba-272e22d7dba4');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "text/plain": [
              "     PubchemFP0  PubchemFP1  PubchemFP2  ...  PubchemFP879  PubchemFP880     pIC50\n",
              "0             1           1           1  ...             0             0  6.721246\n",
              "1             1           0           0  ...             0             0  4.873869\n",
              "2             1           1           0  ...             0             0  6.045757\n",
              "3             1           1           1  ...             0             0  6.552842\n",
              "4             1           1           0  ...             0             0  4.552842\n",
              "..          ...         ...         ...  ...           ...           ...       ...\n",
              "134           1           1           1  ...             0             0  6.397940\n",
              "135           1           1           1  ...             0             0  4.934047\n",
              "136           1           1           1  ...             0             0  4.706637\n",
              "137           1           1           1  ...             0             0  4.665144\n",
              "138           1           1           1  ...             0             0  4.524765\n",
              "\n",
              "[139 rows x 882 columns]"
            ]
          },
          "metadata": {},
          "execution_count": 22
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Nj_ugDTJ3WVh"
      },
      "source": [
        "# You may have noticed that we now have a very long bit of 882 descriptor set, but when we built the model earlier, there are much less descriptors. This mismatch will be addressed later."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "dataset3.to_csv('sars-cov-2_bioactivity_data_2class_pIC50_pubchem_fp.csv', index=False)"
      ],
      "metadata": {
        "id": "oM3yBN5Ayg-1"
      },
      "execution_count": 24,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nFpLoNRHeRa6"
      },
      "source": [
        "# **Let's download the CSV file to your local computer for the Part 3B (Model Building).**"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "3_build.ipynb",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.5"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}